Creating multiple plots for one genome, such as heatmap, histogram, line plot and so on.
-Input files:
Collect all files in the working directory: ~/app/circos
i): karyotype.genomeA.txt, whose format is: "chr - ID LABEL START END COLOR"
chr - aco1 1 0 37743429 rdylbu-11-div-1
Here the chromosome color is given in the last column of the karyotype.genomeA.txt file. Colors can be looked up on the web: http://colorbrewer2.org/
The format of a color name is: Palette-Numcolors-Type-Idx
Palette: name of the color scheme, e.g. gnbu
Numcolors: number of data classes, e.g. 3, 4, 5, 6, 11
Type: nature of your data: div (diverging), seq (sequential), qual (qualitative)
Idx: index of the color to pick out of Numcolors.
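The naming pattern above can be sketched in Python. This is only an illustration of how the four parts are joined and split; the function names brewer_color and parse_brewer_color are hypothetical helpers, not part of Circos.

```python
# Hypothetical helpers: only the "Palette-Numcolors-Type-Idx" pattern
# comes from the notes; the function names are made up for illustration.
VALID_TYPES = ("seq", "div", "qual")


def brewer_color(palette: str, numcolors: int, kind: str, idx: int) -> str:
    """Join the four parts into a color name such as rdylbu-11-div-1."""
    if kind not in VALID_TYPES:
        raise ValueError(f"type must be one of {VALID_TYPES}, got {kind!r}")
    if not 1 <= idx <= numcolors:
        raise ValueError(f"idx must be between 1 and {numcolors}, got {idx}")
    return f"{palette}-{numcolors}-{kind}-{idx}"


def parse_brewer_color(name: str):
    """Split a color name back into (palette, numcolors, type, idx)."""
    palette, numcolors, kind, idx = name.split("-")
    return palette, int(numcolors), kind, int(idx)


if __name__ == "__main__":
    print(brewer_color("rdylbu", 11, "div", 1))
    print(parse_brewer_color("gnbu-9-seq-3"))
```

The index check catches a name like rdylgn-11-div-12, which asks for a twelfth color from an 11-class scheme.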
ii): create an etc folder under ~/app/circos to hold all other configuration files, such as links.conf and ticks.conf
iii): create ticks.conf in the etc folder with all parameters for plotting ticks:
# this means the code starts from the next line
show_ticks = yes
show_tick_labels = no
<ticks>
radius = 1r
color = black
thickness = 2p
multiplier = 1e-6
format = %d # %d
<tick>
spacing = 1u
size = 5p
</tick>
<tick>
thickness = 4p
spacing = 5u
size = 10p
show_label = yes
label_size = 10p
label_offset = 10p
format = %d
</tick>
</ticks>
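To see how spacing, multiplier and format work together, here is a rough Python sketch of the labels for the 5u ticks. It assumes chromosomes_units = 1000000 (so 1u = 1000000 bp) and that a label is the tick position times multiplier, printed with format. The function tick_labels is a hypothetical helper, not a Circos API.

```python
# Illustrative only: mimics how a tick label is derived from its position.
# Assumes 1u = 1,000,000 bp (chromosomes_units) and multiplier = 1e-6.
def tick_labels(chrom_length, spacing_u, units=1_000_000,
                multiplier=1e-6, fmt="%d"):
    step = spacing_u * units  # spacing = 5u -> one tick every 5,000,000 bp
    labels = []
    for pos in range(0, chrom_length + 1, step):
        # round() guards against float noise before %d truncates the value
        value = round(pos * multiplier, 6)
        labels.append((pos, fmt % value))
    return labels


if __name__ == "__main__":
    for pos, label in tick_labels(37_743_429, 5):
        print(pos, label)
```

With these settings the labels read in Mb, so a tick at 5000000 bp is labelled 5.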
Create circos.conf to hold all parameters, and include ticks.conf by adding "<<include ./etc/ticks.conf>>":
show_scatter = yes
show_line = yes
show_histogram = yes
show_heatmaps* = yes
show_tile = yes
show_highlight = yes
use_rules = yes
#specify input file that contains length of each chromosome of genome
karyotype = karyotype.acorus.txt
#specify the order of chromosome
chromosomes_order = aco1,aco2,aco3,aco4,aco5,aco6,aco7,aco8,aco9,aco10,aco11,aco12
chromosomes_units = 1000000
#chromosomes_reverse = /aco/ #reverse
#chromosomes_scale = /aco/=0.7rn,/anc/=0.3rn
#chromosomes_display_default = no
#chromosomes = /aco/;anc1
#chromosomes_radius = /aco/:0.8r
chromosomes_color = aco1=rdylgn-11-div-1,aco2=rdylgn-11-div-2,aco3=rdylgn-11-div-3,aco4=rdylgn-11-div-4,aco5=rdylgn-11-div-5,aco6=rdylgn-11-div-6,aco7=rdylgn-11-div-7,aco8=rdylgn-11-div-8,aco9=rdylgn-11-div-9,aco10=rdylgn-11-div-10,aco11=rdylgn-11-div-11,aco12=[#00441b]
<<include ./etc/ticks.conf>>
<ideogram>
<spacing>
default = 1u
</spacing>
radius = 0.90r
thickness = 20p
fill = yes
stroke_color = dgrey
stroke_thickness = 2p
show_label = yes #show label
label_font = default #font
label_radius = dims(ideogram,radius) + 0.08r #location
label_size = 16 # font size
label_parallel = yes
label_format = eval(sprintf("%s",var(chr))) # format
</ideogram>
<<include ./etc/links.conf>>
<image>
dir* = . # output dir
file* = Acorus.circos.png
radius* = 400p # radius of the image
svg* = no # whether to generate an SVG file
#angle_offset* = -55
<<include etc/image.conf>>
</image>
<plots>
<plot>
type = heatmap
file = genes_num.txt
color = blues-9-seq
r1 = 0.70r
r0 = 0.61r
</plot>
<plot>
type = scatter
fill_color = black
stroke_color = black
glyph = circle
glyph_size = 5
file = genes_num.txt
r1 = 0.80r
r0 = 0.71r
</plot>
<plot>
type = histogram
file = genes_num.txt
fill_color = blue
r1 = 0.89r
r0 = 0.81r
</plot>
</plots>
<<include etc/colors_fonts_patterns.conf>>
<<include etc/housekeeping.conf>>
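The three <plot> blocks in circos.conf each occupy their own ring between r0 and r1. A small Python sketch, assuming the r0/r1 values are copied by hand from the config, can check that the rings do not overlap. TRACKS and overlapping_tracks are hypothetical names used only here.

```python
# (name, r0, r1) copied by hand from the <plot> blocks in circos.conf
TRACKS = [
    ("heatmap", 0.61, 0.70),
    ("scatter", 0.71, 0.80),
    ("histogram", 0.81, 0.89),
]


def overlapping_tracks(tracks):
    """Return pairs of neighbouring tracks whose radial ranges overlap."""
    problems = []
    ordered = sorted(tracks, key=lambda t: t[1])  # sort by inner radius r0
    for (name_a, _r0_a, r1_a), (name_b, r0_b, _r1_b) in zip(ordered, ordered[1:]):
        if r0_b < r1_a:  # next track starts before the previous one ends
            problems.append((name_a, name_b))
    return problems


if __name__ == "__main__":
    print(overlapping_tracks(TRACKS) or "no overlapping tracks")
```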
iv): Prepare the input file "genes_num.txt" for the heatmap, scatter, line and histogram plots
Circos requires input data in the format: chr start end value [options]
which is in fact a 4-column BED file.
First of all, install bedtools with: conda install -c bioconda bedtools
(If its path is not in PATH on a Mac, then we need to add it.)
I took just the first 3 columns (chr, start, end) of the genome's BED file produced by jcvi and saved them as "genes.bed". Be careful that the first column of genes.bed corresponds to the ID column of the karyotype.genomeA.txt file; if it does not, every value in the last column of "genes_num.txt" will be zero. For example, a line in genes.bed should look like: aco1 1526 2561
After that, we use bedtools to create 500 kb windows with the following commands:
cut -d ' ' -f 3,6 karyotype.genomeA.txt | tr ' ' '\t' > genomeA.genome
bedtools makewindows -g genomeA.genome -w 500000 > genomeA.windows
The last step counts the genes in each window:
bedtools coverage -a genomeA.windows -b genes.bed | cut -f 1-4 > genes_num.txt
Therefore, genes_num.txt is our input file.
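To make clear what these commands compute, here is a small pure-Python sketch of the same steps on in-memory data. It is an illustration, not a replacement for bedtools, and all function names are hypothetical. It assumes BED-style half-open intervals and counts a gene in every window it overlaps by at least 1 bp.

```python
def karyotype_sizes(lines):
    """Like the cut command: keep ID (column 3) and END (column 6)."""
    sizes = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 6 and fields[0] == "chr":
            sizes[fields[2]] = int(fields[5])
    return sizes


def make_windows(sizes, width):
    """Like makewindows: fixed-width windows, the last one cut at the end."""
    windows = []
    for chrom, size in sizes.items():
        for start in range(0, size, width):
            windows.append((chrom, start, min(start + width, size)))
    return windows


def count_genes(windows, genes):
    """Count the genes overlapping each window (chr, start, end, count)."""
    rows = []
    for chrom, w_start, w_end in windows:
        n = sum(1 for g_chrom, g_start, g_end in genes
                if g_chrom == chrom and g_start < w_end and g_end > w_start)
        rows.append((chrom, w_start, w_end, n))
    return rows


if __name__ == "__main__":
    sizes = karyotype_sizes(["chr - aco1 1 0 37743429 rdylbu-11-div-1"])
    genes = [("aco1", 1526, 2561),
             ("aco1", 499000, 501000),
             ("Chr1", 10, 20)]  # ID not in the karyotype: never counted
    for row in count_genes(make_windows(sizes, 500000), genes)[:3]:
        print(*row, sep="\t")
```

The "Chr1" gene shows the mismatch problem described above: its ID is not in the karyotype, so it contributes to no window.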
v): Finally, run the command:
circos -conf circos.conf
We will get a picture with multiple plots for genome A.
Reference link:
https://www.jianshu.com/p/ea3a8933ace9
Plotting synteny blocks between two different genomes
In addition to all the files above, we need to create a links.conf under the etc folder containing all parameters for plotting links, and a link txt file listing all synteny blocks under the circos folder.
i): the link txt file has at least 6 columns: chr1 start1 end1 chr2 start2 end2 [options]
Here we use JCVI to get the anc.aco.anchors.simple file, and then replace the gene names in that anchor file with their start and end locations, which is the link input format of Circos. For that, we use simple2links.py
from GengzhouXu's GitHub: https://github.com/xuzhougeng/myscripts
simple2links.py anc.aco.anchors.simple
This gives an output file called anc.aco.anchors.simple_link.txt
But we need to change the chr names of the two genomes anc and aco to "anc" + "chr" and "aco" + "chr", so that they correspond to the IDs in the karyotype files given by karyotype = karyotype.anc.txt,karyotype.aco.txt
in circos.conf
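The renaming can be sketched in Python as below. This is a minimal sketch under two assumptions that should be checked against the real file: that columns 1 and 4 of the link file hold the anc and aco chromosome names respectively, and that those names lack the prefix. prefix_link_chromosomes is a hypothetical helper.

```python
def prefix_link_chromosomes(lines, prefix1="anc", prefix2="aco"):
    """Prefix chr1 (column 1) and chr2 (column 4) of each link line.

    Assumes at least 6 whitespace-separated columns:
    chr1 start1 end1 chr2 start2 end2 [options]
    """
    out = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        fields[0] = prefix1 + fields[0]
        fields[3] = prefix2 + fields[3]
        out.append("\t".join(fields))
    return out


if __name__ == "__main__":
    example = ["1 100 200 3 500 900"]  # made-up coordinates
    print("\n".join(prefix_link_chromosomes(example)))
```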
ii): links.conf
<links>
<link>
file = ancestor1.acorus.anchors.simple_link.txt
radius = 0.65r
color = blue_a4
ribbon = yes
bezier_radius = 0r
<rules>
# each rule sets a different link color when its condition is met
<rule>
condition = var(chr1) eq "anc1"
color=rdylgn-7-div-1
</rule>
<rule>
condition = var(chr1) eq "anc2"
color=rdylgn-7-div-2
</rule>
<rule>
condition = var(chr1) eq "anc3"
color=rdylgn-7-div-3
</rule>
<rule>
condition = var(chr1) eq "anc4"
color=165,105,189
</rule>
<rule>
condition = var(chr1) eq "anc5"
color=rdylgn-7-div-5
</rule>
<rule>
condition = var(chr1) eq "anc6"
color=rdylgn-7-div-6
</rule>
<rule>
condition = var(chr1) eq "anc7"
color=rdylgn-7-div-7
</rule>
</rules>
</link>
</links>
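Writing one <rule> per chromosome by hand is repetitive, so the block can also be generated. The Python sketch below prints a <rules> section with one color per chr1 value; link_rules is a hypothetical helper, and it assumes every chromosome takes the next index of a single palette (the hand-written version above uses an RGB value for anc4 instead).

```python
def link_rules(chroms, palette="rdylgn", kind="div"):
    """Build a <rules> block with one color rule per chromosome in chr1."""
    n = len(chroms)  # used as Numcolors, e.g. 7 -> rdylgn-7-div-<idx>
    blocks = []
    for idx, chrom in enumerate(chroms, start=1):
        blocks.append(
            "<rule>\n"
            f'condition = var(chr1) eq "{chrom}"\n'
            f"color = {palette}-{n}-{kind}-{idx}\n"
            "</rule>"
        )
    return "<rules>\n" + "\n".join(blocks) + "\n</rules>"


if __name__ == "__main__":
    print(link_rules([f"anc{i}" for i in range(1, 8)]))
```

The output can be pasted between <link> and </link> in links.conf in place of the hand-written rules.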
iii): create a new circos.conf in the current circos folder as:
karyotype = karyotype.ancestor1_resize.txt,karyotype.acorus.txt
chromosomes_order = aco12,aco11,aco10,aco1,aco4,aco9,aco8,aco7,aco2,aco6,aco3,aco5
chromosomes_units = 1000000 # unit of chromosome length: 1u = 1000000 bp = 1 Mb
chromosomes_reverse = /aco/
chromosomes_scale = /aco/=0.7rn,/anc/=0.3rn
#chromosomes_display_default = no
#chromosomes = /aco/;anc1
chromosomes_radius = /anc/:0.8r
<<include ./etc/ticks.conf>> # this line includes ticks.conf from the etc folder
<ideogram>
<spacing>
default = 1u
# for a symmetric layout of the two genomes: aco1 faces anc1, aco12 faces anc7
<pairwise aco1 anc1>
spacing = 10u
</pairwise>
<pairwise aco12 anc7>
spacing = 10u
</pairwise>
</spacing>
radius = 0.90r
thickness = 20p
fill = yes
stroke_color = dgrey
stroke_thickness = 2p
show_label = yes #show label
label_font = default
label_radius = dims(ideogram,radius) + 0.08r # location
label_size = 16 # font size
label_parallel = yes
label_format = eval(sprintf("%s",var(chr))) # format
</ideogram>
<<include ./etc/links.conf>> # this line includes links.conf from the etc folder
<image>
dir* = . # output dir; * means this overrides a value already defined elsewhere (here in the included image.conf)
radius* = 400p # radius of the image
svg* = no # whether to generate an SVG file
angle_offset* = -55
<<include etc/image.conf>>
</image>
<<include etc/colors_fonts_patterns.conf>>
<<include etc/housekeeping.conf>>
As the last step, run the command: circos -conf circos.conf
Reference link: https://www.jianshu.com/p/1658e702ba17
We can split a chr into several parts by following: http://www.360doc.com/content/19/1224/13/68068867_881784578.shtml
Or check my folder ~/circos/Acorus_anc1/newchr1_aco